Safe end-to-end imitation learning for model predictive control

Authors

Keuntaek Lee

Kamil Saigol

Evangelos Theodorou

Abstract

We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully-observable and vision-based partially-observable systems in cart-pole and autonomous driving simulations with deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability.

Similar resources

Agile Off-Road Autonomous Driving Using End-to-End Deep Imitation Learning

We present an end-to-end imitation learning system for agile, off-road autonomous driving using only low-cost on-board sensors. By imitating a model predictive controller equipped with advanced sensors, we train a deep neural network control policy to map raw, high-dimensional observations to continuous steering and throttle commands. Compared with recent approaches to similar tasks, our method...

Implementation of Low-Cost Architecture for Control an Active Front End Rectifier

In AC-DC power conversion, active front end rectifiers offer several advantages over diode rectifiers such as bidirectional power flow capability, sinusoidal input currents and controllable power factor. A digital finite control set model predictive controller based on fixed-point computations of an active front end rectifier with unity displacement of input voltage and current to improve dynam...

Imitation Learning with THOR

The recently proposed House Of inteRactions (AI2THOR) framework [35] provides a simulation environment for high-quality 3D scenes. Together with THOR, a Target-driven model is introduced to improve generalization capabilities. Imitation learning, or learning by demonstration, is known to be more effective in communicating a task. In this project, we extend the Target-driven model by exploring both ...

End-to-End Differentiable Adversarial Imitation Learning

Generative Adversarial Networks (GANs) have been successfully applied to the problem of policy imitation in a model-free setup. However, the computation graph of GANs that include a stochastic policy as the generative model is no longer differentiable end-to-end, which requires the use of high-variance gradient estimation. In this paper, we introduce the Model-based Generative Adversarial Imit...

Universal Planning Networks

A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradie...

Publication date: 2018
